C News Vol. 1 Issue 11 Sept 15, 1988
*-------------------------------------------------------------*
| C NEWS - International C Newsletter, Compiler review, and |
| tutorial. |
*-------------------------------------------------------------*
Table of Contents
The Heap: Messages from the Editor ........... Barry Lynch
Memory Models ................................ Bill Mayne
Public Domain Software Review (ROFF) ......... Barry Lynch
C Software from the C BBS (FILECHK) .......... Arnold Cherdak
Article Submission Standards
Addresses
Index
Distribution Points
User Response Form
C News is an electronic journal published monthly by the C BBS
in Burke, VA. The subject of C News is the C programming
language, as well as derivatives like C++. All readers are
encouraged to submit articles, reviews, or comments. C News is
freely distributed, but it may not be sold for profit, nor may
a charge be assessed to cover distribution costs; to do so is a
direct violation of the license agreement, copies of which are
available from the C BBS. This publication is copyrighted
under U.S. Copyright Law.
The HEAP: Messages from the Editor by Barry Lynch
Well, I am back after 10 days in the wilds of Canada: no
computers, sports, or work to worry about. I highly recommend
that everyone take time off from their work and hobbies. It
is amazing what a little fresh air can do for the mind.
LACK OF FEEDBACK
While I was away I spent some time pondering the future
and status of "C News". One of the problems that I see is
the lack of feedback on the articles and reviews contained
within C News. Numerous individuals have spent hours writing
code and accompanying articles for no financial reward. Is
it too much to ask to fill out the questionnaire and send it
in? And if you have questions about some of the code
included, why not take the time to ask the author?

C News is for you, the reader, but it will not survive if
the communication is not two-way. I think that I can safely
say that some of the authors are a little miffed that there
has not been any feedback. Please take the time to offer some
comments.
Which brings me to another point: Letters to the Editor.
At both of the C BBS users meetings held to date, a suggestion
has been placed on the table regarding a "Letters to the
Editor" column for C News. I agree that it would be a good
idea, but once again there have been no letters. Let's at
least see one per issue!! <Or do I have to resort to
"creating" mail to start a column!!>
While we are on the subject of writing and lack of
feedback: one of the individuals who has taken the time to
write for C News is Bill Mayne. Bill has the lead article in
C News this month, on memory models. Bill also has an article
in the September issue of "Computer Language", called "To
Parse or Not Parse", which explores the use of main()'s
command line arguments. If you can, take the time to read the
article and offer Bill comments or congratulations on a job
well done. Bill is a regular user here at the C BBS, and
answers mail left to him as time permits.
Another user who has taken the time to write is Arnie
Cherdak. Arnie wrote the article on database design in C in
Issue 10. The article was written to be used as an
interactive tutorial: exercises were included, with the
intended result of readers sending in the code they wrote. To
date no one has commented on the article or shown any code.
Once again, lack of feedback.
ROFF
Well, that is all for now on the subject of feedback; I
think I have made my point. You may notice that C News has a
new format beginning with this issue. I have started to use
ROFF, a PC-based version of the UNIX roff utility. As I
become more familiar with the software, the format may change
yet again. The goal is to have an automatic generator for C
News, or any newsletter for that matter. Roff is part of the
answer, and some additional code is needed. Suffice it to say
that the use of Roff will provide a neater end product, and
free up more of my time for editing and writing. If anyone is
interested in Roff, I have two versions of the program here on
the C BBS. See the Public Domain Software Review for more
information.
CHOOSING A MEMORY MODEL by Bill Mayne
ABSTRACT: This article explains the meaning of the "near",
"far", and "huge" keywords specifying pointer types, and how
these are related to the various memory models available to C
programmers using the 80x86 family of processors found in IBM
and compatible PCs and their successors. A simple benchmark
which illustrates the effect of memory model selection on code
size and execution time is shown. Coding examples show how to
use preprocessor symbols and the #if directive to handle cases
where source code must be modified according to the memory
model in use. The compilers used are Microsoft C (MSC)
versions 4.0 and 5.0 and Turbo C version 1.5.

Based on an understanding of pointer types and memory
models, confirmed by the results of the benchmark, guidelines
for the selection of the best memory model for a program are
given.
ACKNOWLEDGEMENT: Thanks to Jerry Zeisler, who sparked
interest in the subject of this article in a "conversation" on
the C BBS and helped with the benchmark by compiling it with
Turbo C. Thanks also to Barry Lynch, editor of C News and
sysop of the C BBS, for his encouragement, assistance with
file transfers, and running a fine BBS for the discussion of C
related issues.
1. INTRODUCTION
The use of the "near", "far", and "huge" keywords when
declaring pointers, and the selection of a memory model for a
program written in C, are problems unique to the 80x86 family
of processors, because both are tied to the segment:offset
addressing scheme used in this architecture. Before
discussing the advantages and disadvantages of the various
options available, it is useful to briefly describe this
scheme for those not already familiar with the machine
language of the 80x86 architecture. Experienced 80x86
programmers may wish to skip section 1.1, which explains the
various types of pointers, and go directly to 1.2, which
explains memory models. All of the information in sections
1.1 and 1.2, except a few historical asides and other
comments, is in the Microsoft C User's Guide.
1.1 80x86 Addresses and Pointer Types
The 80x86 family of processors used in IBM and compatible
PCs are 16-bit processors descended from the 8080, or its
spin-off the Z80, used in earlier CP/M machines. A 16-bit
machine is so called because its word size is 16 bits.
Usually, but not always, the size of a pointer, word, and
integer are the same; the 80x86 family is one of the
exceptions. A 16-bit word can hold only 2**16 or 64K distinct
addresses. In 80x86 processors, as in most micros and many
larger processors, the unit of memory addressed is a byte.
The address of a larger unit like a word is given by the
address of its first byte, which may be required to fall on
certain boundaries, such as even numbered addresses or
multiples of the word size. (There are machines which use
word addressing. This has advantages, especially for
scientific/engineering "number crunchers", but it is not so
good for handling character data.)
When the 8080 and Z80 first came out, memory was much
more expensive, and being able to address 64K was thought to
be sufficient. Another consideration was that limiting
addresses to 16 bits made the construction of memories simpler
and cheaper, and early microprocessors were embedded in other
systems for control purposes and did not need so much memory.
The use of microprocessors for data processing applications in
microcomputers came later; the term "Personal Computer", or
PC, was not yet in common usage.
As an additional historical note, mainframes of the time
were designed with much larger address spaces, though still
small by the standards of today and the near future. The IBM
360 and 370, which had 32-bit processors, used only 24 bits
for addressing, limiting addressable memory to 16M even for
these large machines. Already some PCs using extended memory
have that much. By contrast, IBM mainframes in use today have
the option of "extended architecture", or XA, using 31 bits
for addresses, and the next wave, called "Enterprise System
Architecture" or ESA, adds another 12. The amount of storage
which can be addressed by 43 bits is truly immense, 2**43 or
about 8.8e12 bytes, more than any main storage we are likely
to see for a long time. Even so, such large address spaces
are actually useful, since nearly all mainframes have the
hardware and software to support virtual memory.
When the price of memory came down and the need for a
larger address space became important, but 16-bit
microprocessors were still the norm, designers decided to use
a segmented memory architecture. Segments would contain 64K
bytes each, so the relative position of a byte within a
segment could still be represented by a 16-bit register.
Extra registers were added to address the segments. For
flexibility, segments were allowed to start on any 16-byte
"paragraph" boundary. The 80x86 has registers for addressing
4 segments. They are CS ("code segment"), DS ("data
segment"), SS ("stack segment"), and ES ("extra segment").
The names reflect the way they are normally used. A segment
register gives the address of the first paragraph of a
segment, shifted right 4 bits to fit within a 16-bit word. To
compute an actual address, the segment is shifted left 4 bits
to convert it to a byte address, and then the offset is added
to address any of the 64K bytes within the segment. Most
programs, whether written in assembly language or a compiled
language, take advantage of the registers and make things
cleaner by putting code, data, and stack into separate
segments addressed by the registers named for those purposes.
(It is true that the stack contains data, and for that matter
code itself is a kind of data, but the conventional
distinctions are useful.)
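
To make the arithmetic concrete, here is a minimal sketch of
mine (not from the original article); the segment and offset
values are arbitrary:

/* 8086 address arithmetic: physical = (segment << 4) + offset.
   A minimal sketch; the example values are arbitrary. */
#include <stdio.h>

int main()
{
    unsigned int seg = 0x1234;           /* paragraph number */
    unsigned int off = 0x0010;           /* byte within the segment */
    unsigned long phys = ((unsigned long)seg << 4) + off;

    printf("%04X:%04X -> physical address %05lX\n", seg, off, phys);
    return 0;
}

This prints "1234:0010 -> physical address 12350".
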
Normally such details of machine architecture are of
concern only to the assembly language programmer, but the
processor architecture does influence parts of the compiler
design. C programmers who wish to understand the reasons for
such design decisions, and in particular the architecture
specific details of pointer types and memory models, need to
understand these details as well.
In machine language it is very convenient if all the
memory referenced lies within a segment whose address is
already loaded in the appropriate register. With the segment
implied, only the 16 bits of the offset must actually be
included in the pointer. Such a pointer is called a "near"
pointer. If, on the other hand, the code or data referenced
does not all lie within a 64K segment, it is necessary to
specify the segment as well as the offset, and a "far" pointer
is required. This is significant not only for space (far
pointers requiring four bytes instead of two), but for
performance. At the machine language level, the use of far
pointers requires the values of segment registers to be
swapped every time a different segment is accessed. Not only
does an actual pointer take up more space, so does the code to
manipulate it, and the extra instructions also increase the
execution time. This applies not only to explicit pointer
arithmetic, but to array references, sometimes to global
variable references, and to other situations involving
implicit address calculations.
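
The space difference is easy to see with sizeof. The sketch
below is mine, and assumes an MSC or Turbo C compiler, since
the pointer keywords are 80x86 extensions rather than standard
C:

/* Pointer sizes under the explicit keywords: near pointers
   occupy two bytes, far and huge pointers four. */
#include <stdio.h>

int main()
{
    char near *np;
    char far  *fp;
    char huge *hp;

    printf("near=%u far=%u huge=%u\n",
           (unsigned)sizeof(np), (unsigned)sizeof(fp),
           (unsigned)sizeof(hp));
    return 0;
}
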
Far pointers are used for data when a program references
more than 64K of data in total, but it is still convenient if
each array or structure fits within a segment. Then the
segment address used can be selected so that the address of
any element of the array or structure can be computed without
changing the segment part of the address. If even this
restriction must be removed, a "huge" pointer is required.
Huge pointers are four bytes long, just like far pointers, but
arithmetic with huge pointers requires extra steps (and
code). Both huge and far pointers follow the rule, common in
microprocessors, of storing the least significant byte first.
The first word of a far pointer is the offset and the second
word is the segment. This is important to know if you must
construct a far pointer from its components, or decompose one
into its segment and offset parts. A macro in Listing 1 shows
how to do the former, and the library macros FP_SEG and FP_OFF
do the latter. By the way, the segment and offset are also
each stored least significant byte first, but the
implementation of shifting and arithmetic in C takes care of
this for you, and you need not be concerned about it.
Since offsets into code are not used in this way, the
"huge" keyword applies only to pointers to data, including
array names.
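
To make the layout concrete, here is a small sketch of both
directions. It is mine rather than the article's: FP_SEG and
FP_OFF are the library macros just mentioned (declared in
<dos.h> by MSC and Turbo C), while the composing macro is
hand-rolled along the lines of Listing 1's FP_PTR, and the
address used is only an example.

/* Composing and decomposing a far pointer; a sketch assuming
   <dos.h> supplies FP_SEG and FP_OFF. */
#include <stdio.h>
#include <dos.h>

/* build a far pointer from segment and offset words */
#define MAKE_FP(seg,off) \
    ((char far *)(((unsigned long)(seg) << 16) | (unsigned)(off)))

int main()
{
    char far *p = MAKE_FP(0x0040, 0x006C);  /* example address only */

    printf("segment=%04X offset=%04X\n", FP_SEG(p), FP_OFF(p));
    return 0;
}
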
Page 6
C News Vol. 1 Issue 11 Sept 15, 1988
Assembly language programmers must be directly concerned
with considerations such as those above. C programmers have
it a little easier, since the C compiler automatically takes
care of generating addresses and swapping segment registers.
Still, the programmer concerned with efficiency should
understand what is required and control the selection of
pointer types to produce the most efficient code compatible
with other goals such as ease of programming and
maintainability.
1.2 Memory Models
The term "memory model" simply refers to the combination
of defaults for code and data pointers. Though individual
pointers may be explicitly declared "near", "far", or "huge",
the memory model used is very important to program design. It
partly determines the amount of code and/or data a program can
address. In addition, as the bench mark in a later section
shows, the selection of a memory model may have important
implications for the size and efficiency of the generated
code. As a rule, it is better to use the smallest pointer
which will work. Use "near" in preference to "far" and use
"huge" only if absolutely necessary.
In the small memory model, both the code and data are
addressed by near pointers. Small model programs are thus
limited to 64K of code and 64K of data, or a total of 128K.
Most programs fit within this limit, and it is the most
efficient model, so it is the default.
Medium model programs use near pointers for data and far
pointers for code. They can therefore have only 64K of data,
but the amount of code is limited only by available memory.
The medium model is preferred by the integrated environment of
Quick C, but is otherwise not often useful for hobbyist
programmers. It takes a rather large program to exceed 64K of
code, and most that do probably also exceed 64K of data and
thus need the large or huge model. However, since references
to data are executed much more frequently than far references
to code, the medium model does have quite a performance
advantage over large in those cases where it does fit the
requirements.
Compact model programs use far pointers for data and near
pointers for code. This model is good for programs which
allocate a lot of data, but which have less than 64K of code.
A common example would be a simple editor which stores a whole
file in memory as an array or linked list.
The advantage of the compact model over the large model
is usually less than the advantage of medium over large, but
the choice is almost always between compact and large or
between medium and large, hardly ever between compact and
medium.
Page 7
C News Vol. 1 Issue 11 Sept 15, 1988
Large model programs use far pointers for both data and
code. They can have any amount of code and/or data which will
fit in memory, in any combination. The only restriction is
that individual arrays or structures cannot exceed 64K.
The huge model uses far pointers for code and huge
pointers for data and is thus restricted only by the amount of
storage available. It is also the least efficient, and is
rarely needed.
The tiny memory model, which is an option with Turbo C
but not with Microsoft, is similar to small. Both code and
data pointers are near pointers but, in addition, all segments
are assumed to be the same; that is, the total of data and
code is restricted to 64K. This might yield smaller and/or
faster code in some cases, if the compiler took advantage of
it. In the simple benchmark given below no significant
difference was found.
Another important design consideration is that the
library routines will assume the default pointer types
according to the memory model in use. Under MSC release 4.0
there is a set of libraries for each memory model, and the
linker automatically selects the set matching the .OBJ files
linked. MSC 5.0 may be installed with combined libraries, but
there are still separate versions of library routines for each
installed memory model. (Mixing memory models and even more
exotic options are possible, but such advanced topics are not
covered here.)
For example, memcpy() will expect both pointer arguments
to be either near or far pointers, according to the memory
model in use. If it is necessary to use a far pointer to
reference a block of memory to be copied in a program which
otherwise uses near pointers, an alternative must be provided,
either in line or by a specially written function which has a
different name. The coding example in Listing 1 shows a
simple but realistic case in which this is necessary. The
function cmdargs() needs to build a far pointer to the
unparsed command line arguments in the program segment prefix
and use this to copy the argument string to a buffer supplied
by the calling program. If the source code is compiled using
the small or medium memory model, memcpy() cannot be used; in
that case in-line code is selected. The decision is made at
compile time by testing the preprocessor symbols which
identify the memory model. Since the symbols which tell the
preprocessor that the compact, large, or huge model is in use
are only defined when using MSC, the version with in-line
code, which will actually work with any memory model, is the
default (the #else case).
2. GUIDELINES FOR MEMORY MODEL SELECTION
Many C programmers find the selection of memory model a
confusing or even mysterious issue. The default small model
is sufficient most of the time, so beginners can put off
having to consider memory models at all. But there comes a
time, as programs and/or the quantity of data grow, that other
models are necessary. Rather than take the coward's way out
and simply resort to using large or huge all the time, as some
have done, the wise programmer should understand the issues
and pick the best memory model for the job.
Even in this age of cheap hardware and abundant
resources, it may make sense to make the best choice you can
to minimize the use of resources. A smaller .EXE file will
obviously load faster, and for many programs load time is
significant, especially if you are loading from a floppy
disk. Also, with 360K floppies, keeping the .EXE file size
down may determine whether you can keep the program and data
all on one floppy. Looked at yet another way, it may make the
difference between being able to put a frequently used program
on a RAM disk or having to load it from a hard disk, or worse
yet a floppy. And needless to say, if you want to either make
your program resident or shell to DOS from it, it is
worthwhile to conserve both code and data space. If nothing
else, keeping the code size down leaves more room for data,
and you never know when you may need it.
Most of the time, the choice comes down to selecting the
model which the program requires. The main purpose of this
article is to help users avoid erring on the side of caution
by automatically going to the large model as soon as they run
out of space with small.

Rarely, performance considerations may be so important
that designing the program in advance for a particular model
is worthwhile, and in that case it is even more important to
have a good idea of the trade-offs involved.
2.1 Determining the Minimum Model Required
Assuming you are not willing to design a program around
the choice of memory model, the problem comes down to
selecting a memory model for a program which is already
designed and possibly coded. As noted in 1.2, the best choice
is the one which uses the smallest pointers which will do the
job.
2.1.1 Code Pointer Requirements
The size of code pointer required is easy to determine
and may constrain the choice of memory model. If a program,
counting all library functions, will fit in 64K or less of
code space, use the small or compact model; otherwise use
medium, large, or huge.

The code part of most small programs obviously fits in
64K. For extremely large programs it may obviously exceed
that. For anything in between the decision is less clear and
extremely difficult to estimate. Fortunately, the decision is
always a clear go or no go, and the linker will tell you.
Unless the program is very big, it is best to start by
compiling all functions using the small or compact model. If
the 64K limit is exceeded the linker will give a clear error
message. (If you ever exceed 64K in a single source file the
compiler would catch that, but shame on you. Modularize!)
Few if any functions need to be coded differently to
switch to one of the larger models, so the chances are that
all you will have to do, when and if you find it necessary, is
recompile all functions using one of the larger models and
relink. If you have a make file for the project, that should
be simple indeed.
In those rare instances where it is necessary to modify
source code according to memory model, consider coding so that
you can compile using any memory model. It is almost
inconceivable that the size of the code pointer will be
critical in the source program, so there are really only two
cases to consider: near and far data pointers.
With MSC, coding for both possibilities is easy, because
an automatically defined preprocessor symbol tells the
preprocessor which model is being used, and this can be used
with the #if directive to select between alternative versions
of the affected parts of the source code. The symbol is
M_I86xM, where "x" is the one character identifier of the
model in use: M_I86SM for small, M_I86MM for medium, M_I86CM
for compact, M_I86LM for large, and M_I86HM for huge. For all
models except huge, the symbol for the corresponding model is
defined and all others are undefined. Huge is a special case,
where both M_I86HM (as expected) and M_I86LM are defined,
perhaps because the huge model is an extension of the large
model.
Listing 1 shows a simple but realistic case where these
symbols are used to select code based on memory model.
Listing 2 is a little more contrived, selecting only a string
to be displayed, but it checks all models. Note that if the
difference between large and huge matters at the source code
level, you must not conclude that the large model is the one
in use just because M_I86LM is defined. It could be that
M_I86HM is also defined, indicating huge. That is why the
code in Listing 2 checks M_I86HM before M_I86LM.
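
Turbo C, which Listing 2's #else branch can only report as
"unknown", has its own predefined model symbols. The sketch
below is mine, not the article's; it assumes Turbo C (1.5)
predefines __TINY__, __SMALL__, __MEDIUM__, __COMPACT__,
__LARGE__, and __HUGE__ (check your compiler manual), and it
tests huge before large for exactly the reason just given.

/* Identifying the memory model under either MSC or Turbo C; a
   sketch, assuming the Turbo C symbol spellings shown.  Huge
   is tested before large because MSC's huge model defines
   M_I86LM as well as M_I86HM. */
#include <stdio.h>

static char *model =
#if defined(M_I86HM) || defined(__HUGE__)
    "huge";
#elif defined(M_I86LM) || defined(__LARGE__)
    "large";
#elif defined(M_I86CM) || defined(__COMPACT__)
    "compact";
#elif defined(M_I86MM) || defined(__MEDIUM__)
    "medium";
#elif defined(M_I86SM) || defined(__SMALL__)
    "small";
#elif defined(__TINY__)
    "tiny";
#else
    "unknown";
#endif

int main()
{
    printf("memory model: %s\n", model);
    return 0;
}
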
The amount of code is fixed. If you are able to get a
clean link, you never need worry that a decision to use the
small or compact model will come back to haunt you, and your
resulting .EXE file will be smaller, sometimes much smaller.
Jerry Zeisler, who helped in the preparation of this article
by compiling and linking the benchmark using Turbo C 1.5,
reported that when he was forced to go from the small to the
large model for a program, the .EXE file went from 71K to
161K. Using either medium or compact according to the
requirements would have made the jump less drastic, but it
does go to show that once you cross the line from small to
another model you do pay a price in space.
2.1.2 Data Pointer Requirements
Finding the size of data pointer required is not as
clear-cut as determining whether or not a near code pointer
will suffice. The amount of storage a program will need at
run time cannot be determined in advance by the compiler or
linker in every case. Since C is a semi block structured
language, automatic variables are allocated on block entry,
and the total required varies with the depth and order of
block entries. This does not depend only upon the static
structure of your program; it may also depend upon the data
each time you run it. Sometimes you can arrive at a maximum,
but for a program of any complexity it would be a tedious and
error-prone process requiring a lot of knowledge of your
compiler implementation. If the program uses recursion it may
not even be possible.
Even when there is no recursion, the uncertainty
concerning data space requirements may be a problem in a
program which allocates heap storage using malloc() or similar
functions, since this is even less predictable. This puts a
greater burden on the programmer, and I don't offer any hard
and fast rules here.
If you can determine that 64K of data will always be
sufficient, try the small model first, going to medium if
necessary because of the code size. Otherwise use compact if
possible, going to large if the code size requires it.

Use huge only as a last resort, as it is the least
efficient, especially with MSC 4.0. You can almost always
determine ahead of time whether or not any single data item
will exceed 64K, so the choice between large and huge is
usually easy.
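
As a footnote of mine (not from the article): the case that
forces huge is a single data item bigger than 64K, and the
larger-than-segment allocation itself usually comes from a
special allocator. The sketch assumes MSC's halloc() and
hfree() from <malloc.h>; Turbo C users would look at
farmalloc() instead. Check your library reference for the
exact declarations.

/* Allocating and walking a single data item larger than 64K;
   a sketch assuming MSC's halloc()/hfree(). */
#include <stdio.h>
#include <malloc.h>

int main()
{
    long i;
    char huge *big = halloc(100000L, 1); /* 100,000 bytes: more
                                            than one segment */
    if (big == NULL)
    {
        puts("allocation failed");
        return 1;
    }
    for (i = 0; i < 100000L; i++)   /* crosses a segment boundary */
        big[i] = 0;
    hfree(big);
    return 0;
}
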
3. MEMORY MODEL BENCHMARK
The benefits of using the larger pointer types are
obvious, and amount to a go/no go decision in most cases. For
those cases where performance and/or space is very critical
and the choice of memory model may affect the design in
non-trivial ways, it is good to get an idea ahead of time of
what the costs are as well. The simple benchmark used here
was devised for such a project, where the design could take
advantage of as much storage as possible, but the performance
of bitor() and similar functions was critical, since they
would be called millions of times each.
3.1 Benchmark Code

The source code for the very simple benchmark performed
is shown in Listing 2 and Listing 3. Listing 2 defines the
main() function and Listing 3 defines an external function
bitor(), which performs a bitwise or operation between two
memory buffers. The benchmark measures the efficiency of
calling and executing bitor() under various memory models,
which was the problem of interest for the project which
motivated this whole study.

An important reason for compiling the functions
separately, besides the fact that bitor() was actually
intended for use in other programs, was that it guaranteed
that no optimizer could eliminate the repetitive calls.

The main() function accepts parameters which determine
how many times to call bitor() and the size of the buffers, up
to a maximum of 256. A two level nested loop was used simply
to avoid using a long integer counter for more than 64K
repetitions.
Both functions were optimized for speed. This is the
default with MSC, and was used with Turbo C for consistency.
This is usually a wise choice for small programs anyway. In
this case, the bulk of the code comes from the library
routines, and the bulk of the execution is in the compiled
functions. Optimizing the compiled functions for space would
have saved little space, and possibly cost a lot of time. The
usual rule should be to optimize anything seldom executed for
space, and anything frequently executed for time.
3.2 Execution Time Test
When testing for execution time, I used the Rexx program
shown in Listing 4 to set up and time the execution of the
.EXE files prepared under each memory model. In every case,
the .EXE files are copied to D:, which is on the DOS path and
is a RAM disk. This virtually eliminates any variability
caused by the placement of the .EXE files on a hard disk. Two
tests are performed.
Table 1 shows the time in seconds when bitor() is
executed 300,000 times specifying a length of 0, thus
measuring mostly calling overhead. The differences between
memory models are thus mostly related to the type of code
pointer.
Table 2 shows the time to execute bitor() 2,500 times
specifying a length of 256. In that case the execution time
predominantly reflects the indirect memory references in
bitor(), which do the real work and take most of the time, so
the primary influence is the data pointer type.
The results are not surprising. The small model is the
most efficient, followed by medium or compact, depending upon
which test you look at, then large, and finally huge.
Further, in the first test, compact is nearly equal to small
and medium nearly equal to large. In the second test this
grouping is reversed: medium is close to small and compact is
close to large. This confirms the analysis done ahead of
time. It also goes to show again that the relative importance
of different factors affecting performance depends not only
upon the specific program, but sometimes upon the parameters
or other data as well.
One thing which is surprising at first is that although
MSC 4.0 and 5.0 are generally quite close, 4.0 shows a (pardon
the pun) huge penalty for using the huge model in the second
test. This is probably because the huge model was a new
feature with that release, and by the time release 5.0 came
out the developers had had more chance to optimize it.
3.3 Code Size Compared

Table 3 lists the size of the .OBJ and .EXE files
produced by each compiler with each memory model. The files
have been renamed according to their respective memory
models. The results are mostly self explanatory. The sizes
of the .EXE files must be taken with half a grain of salt,
since they consist mostly of library routines, which may not
even have been written in C, and don't necessarily show the
quality of the compiler.
3.4 Conclusions
For each compiler, the time and code space efficiency of
the various memory models compare to one another exactly as
our theoretical explanation predicts: the small model is the
most efficient and should be used in those cases where it will
serve the purpose. These tests show no advantage of the tiny
model over the small model.

Medium and compact are both between small and large, but
cannot be strictly ordered. The relative efficiency of these
two depends upon the individual program and data. In any
case, the programmer is seldom faced with the choice between
medium and compact.
The large model is less efficient than small, medium, or
compact, though the difference between it and either medium or
compact may not be significant. When far code references
predominate, medium is close to large and compact is close to
small. When data references predominate, the situation is
reversed. The latter case is the most common in practice.

The huge model is the least efficient. The penalty for
going from large to huge is quite severe with MSC 4.0, less
with 5.0, and almost insignificant with Turbo C, a real
tribute to the optimization of Turbo C.
Caution is always in order when using benchmarks to
compare different vendors' products, especially compilers. It
is often easy to devise a test to make one's choice come out
on top. Contradictory advertising claims suggest this is in
fact what vendors do. The benchmark shown here is highly
selective, in that it aims to isolate certain features of
interest. It does not use any floating point operations,
recursion, or complex calculations of any kind, and does not
do any significant amount of I/O, for example. Still, it does
measure the things of interest here rather well, and was not
written with the purpose of proving a given compiler better or
worse.
It is therefore worth noting, without drawing dogmatic
conclusions, that, contrary to the claims of Microsoft when
pushing upgrades to 5.0, version 4.0 sometimes produces better
object code. In fact, for actual applications, I have hardly
ever found a case where recompiling something with MSC 5.0
yielded a smaller or significantly faster .EXE file than I
previously had gotten from 4.0.
MSC 5.0 introduced a lot of new functions, but if you
don't need them and are not using the huge model you may do
better to continue using 4.0. I have also found version 4.0
to be a much more reliable product. I only report my
experience. Perhaps my applications are not representative.
I never use floating point math but use recursion fairly
often, for example.
So many bugs were reported with 5.0 that Microsoft rather
quickly announced 5.1. I did not have 5.1 available for
testing, because I had such a bad experience with 5.0 that I
didn't feel like paying another upgrade fee to fix their bugs,
preferring to spend about the same amount of money on Turbo C
if it came to that. The results of this limited benchmark
seem to strengthen that resolve. In every case, Turbo C
produced tighter, faster object code, a rather impressive
achievement considering the price differential.
=================================================================
/* Listing 1: CMDARGS.C */
/* Get the unparsed command line arguments from the PSP. */
/* Sets the input variable to the line and returns its length. */
#include <stdlib.h>
#include <string.h>
#include <dos.h>

/* Build a far pointer from segment and offset words.  The cast
   makes the result directly usable as a pointer. */
#define FP_PTR(seg,off) ((void far *)((((unsigned long)(seg))<<16)+(off)))

int cmdargs(result) char *result;
{
    /* Offset 0x80 in the program segment prefix holds the length
       of the command tail, followed by the text itself. */
    unsigned char far *dta=FP_PTR(_psp,0x80);

/* if compact, large or huge, data pointers are far: use memcpy */
#if defined(M_I86LM) || defined(M_I86CM)
    memcpy(result,dta+1,*dta);
    result[*dta]=0;
    return *dta;
#else
    /* near-data models: copy through the far pointer in line */
    {
        int length=*dta;
        int ret_len=length;
        dta++;                          /* skip the length byte */
        while (length--)
            *(result++)=*(dta++);
        *result=0;
        return ret_len;
    }
#endif
}

#if defined(TEST)
#include <stdio.h>
main()
{
    char args[128];
    cmdargs(args);
    putchar('"');
    fputs(args,stdout);
    putchar('"');
}
#endif
=================================================================
/* LISTING 2 - TEST.C */
#include <stdio.h>
#include <stdlib.h>     /* atoi() */

void bitor(char *, char *, int);

/* Use preprocessor symbols to determine
   the content of string model[] */
static char model[8]=
#if defined(M_I86SM)
    "small";
#elif defined(M_I86MM)
    "medium";
#elif defined(M_I86CM)
    "compact";
#elif defined(M_I86HM)
    /* NOTE huge must be tested before large,
       because huge sets M_I86LM as well as M_I86HM */
    "huge";
#elif defined(M_I86LM)
    "large";
#else
    "unknown";  /* non-standard (or Turbo C) */
#endif

main(argc, argv) int argc; char **argv;
{
    char buf1[256], buf2[256];
    int i=0, j, jlim=0, len=sizeof(buf1);

    /* i=outer loop count; j=inner loop count; defaults 0 0 */
    /* each case falls through to pick up the earlier arguments */
    switch (argc)
    {
    case 4:
        len=atoi(argv[3]);
        if (len>sizeof(buf1)) len=sizeof(buf1);
    case 3:
        jlim=atoi(argv[2]);
    case 2:
        i=atoi(argv[1]);
    }
    printf("model=%s i=%d j=%d len=%d\n",model,i,jlim,len);
    while (i--)
        for (j=jlim; j; j--)
            bitor(buf1,buf2,len);
}
=================================================================
/* LISTING 3 - BITOR.C */
/* Perform bitwise or between two buffers */
void bitor(x,y,len) char *x, *y; int len;
{
    while (len--)
        *(x++)|=*(y++);
}
=================================================================
/* Listing 4: TIMETEST.REX */
source='MSC4 MSC5 TURBOC'
parms.1=10 30000 0
parms.2=1 2500 256
models='S M C L H' /* Tiny model tested separately */
do ii=1 to words(source)        /* one pass per compiler */
  s=word(source,ii)
  copy '\'s'\*.exe d:'          /* run from the RAM disk */
  outfile=s'.DAT'
  do j=1 to 2                   /* the two parameter sets */
    do i=1 to words(models)
      m=word(models,i)
      /* Here is the key part: execute and record the time */
      call time 'R'             /* reset the elapsed-time clock */
      'TEST'm parms.j
      time.j.m=time('E')        /* read elapsed seconds */
    end
  end
  do i=1 to words(models)
    m=word(models,i)
    data=m time.1.m time.2.m
    say data
    call lineout outfile, data
  end
end
exit
=================================================================
Table 1: Speed Test - Function Calls (seconds)

Model     MSC 4.0   MSC 5.0   Turbo C 1.5
-------   -------   -------   -----------
Tiny          *         *       25.10
Small     35.32     34.27       25.10
Medium    42.13     41.58       27.30
Compact   35.54     34.43       21.42
Large     42.29     41.63       23.07
Huge      43.88     41.63       25.98

=================================================================
Table 2: Speed Test - Indirect Byte Reference (seconds)

Model     MSC 4.0   MSC 5.0   Turbo C 1.5
-------   -------   -------   -----------
Tiny          *         *       18.89
Small     32.19     32.19       18.90
Medium    32.29     32.24       18.95
Compact   35.92     35.86       30.92
Large     35.98     35.93       30.98
Huge      68.88     41.19       31.03

* Tiny model not applicable to MSC.
=================================================================
Table 3 - Comparing .OBJ and .EXE File Size (bytes)

File         MSC 4.0   MSC 5.0   Turbo C 1.5
----------   -------   -------   -----------
bitort.obj       *         *         194
bitors.obj     309       287         192
bitorm.obj     316       294         197
bitorc.obj     309       285         196
bitorl.obj     316       292         201
bitorh.obj     381       326         182
testt.obj        *         *         473
tests.obj      541       521         473
testm.obj      557       537         487
testc.obj      560       541         495
testl.obj      576       557         509
testh.obj      653       636         485
testt.exe        *         *        6534
tests.exe     6670      7383        6334
testm.exe     6870      7531        6476
testc.exe     8770      9501        7898
testl.exe     8970      9649        8056
testh.exe     9082      9729        9143

* Tiny model not applicable to MSC.
Public Domain Software Review: ROFF by Barry Lynch

Program: ROFF4.ARC
Purpose: An MS-DOS implementation of ROFF, a UNIX utility.
         Provides document formatting and typesetting.

A friend of mine suggested that I use ROFF to produce C
News. Previously I had mentioned the hassle involved with
generating each edition of C News. So, I picked up a copy of
ROFF4 and tried it out.

This edition of ROFF4 is a 1984 version that was
originally written by Ernest Bergmann of Lehigh University. C
source code is provided, as well as 19 pages of documentation.
This issue of C News was generated with this package. The
output is excellent and the utility is fairly easy to use, if
you read the documentation carefully.

However, there are some problems that need to be
addressed. One is the documentation: it does not clearly
spell out the functionality of each option. I had difficulty
getting certain lines NOT to right justify, even after putting
a non-right-justify flag in place. Also, the source code
example included with one of the articles was mangled, and I
had to cut and paste a new version in.

All in all this is not a bad utility. The problems that
I experienced could be due to my unfamiliarity with ROFF. I
will be spending the next couple of weeks working with it to
make sure that it can indeed do the job I need it to do, which
is to create C News automatically.

* Editor's Note: I will be working on enhancing this product
                 in the coming months: specifically, adding
                 on-line help and cleaning up the documentation
                 with examples.
FILECHK version 1.0 by Arnold Cherdak, all rights reserved

Disk File Integrity Checker: Reads all files on a disk or in
one or more (sub)directories to verify that the directory
entry for each file is correct and that the file is readable.

Usage: FILECHK [-D] [-H] [-?] [directoryname]

Default operation (without command line entries) causes
FILECHK to read all files in the current directory and all
files in subdirectories below the current directory.

The -D option causes FILECHK to process files in the current
directory only.

The -H and -? options provide the help screen (this one).

The presence of a string, 'directoryname', causes FILECHK to
begin processing in the directory having that name.
This program was written in response to a friend's plea for
help. He runs several PC networks which have been in
operation for quite some time. Occasionally, one of his
installations receives a near hit by lightning. Needless to
say, this scrambles the computers' brains and creates problems
which don't show up right away, in addition to those that
announce themselves with furious noises and billows of smoke
-- surge protectors notwithstanding.
FILECHK reads all the files on the disk, or in a directory, or
in a directory and its subdirectories. If a file can be read
and its size as read agrees with the directory entry, then
it's a pretty good bet that the file is usable. My friend had
several cases where the directories were OK but the files had
been clobbered, and the effects on network operation weren't
always readily discernible. After much searching among
HUNDREDS of files, he finally found the few bad files.
FILECHK will now do it for him faster and more completely than
he can do it.
If you want a record of FILECHK's output, you may redirect
it to your printer or to a file using normal DOS redirection,
since all output goes to stdout. Examples:

FILECHK -d
FILECHK -D utilities >prn
I've included the source code, in Turbo C version 1.5. I
began to do this since the virus problem became so acute.
With the source, you can compile the program yourself and be
certain that there will be no surprises in store for you when
you run it. I've run FILECHK in the root directory of the 30
megabyte drive in my own computer. It read over 25 megabytes,
about 1600 files in a veritable rat's nest of subdirectories,
in about 5 minutes. It would be a bit faster when writing to
a file, a bit slower when writing to the printer. FILECHK
found no problems. I ran CHKDSK to compare, and it found 3
lost clusters. In the past, I've found that lost clusters
don't always mean that files are lost. In fact, I've been
able to make the lost clusters disappear by moving files off
or around the disk. The files copied all right and
afterwards CHKDSK could find nothing wrong. Stranger than
fiction.

Arnie Cherdak
8/14/88
ARTICLE SUBMISSION STANDARDS AND ADDRESSES
As I have repeatedly stated in this and previous issues
of the newsletter, I would like to see user-submitted
articles, reviews, or questions. Listed below are the
standards that should be followed to make my job easier as an
editor.

- Articles should be submitted as non-formatted ASCII files.
  (Margins 0-65, PLEASE.)

- If the article includes code fragments as examples, you may
  also include the entire source file for distribution with
  the newsletter.

- Book or magazine reviews should follow the same format that
  is outlined in this issue. The publisher, author, title,
  and ISBN are a must.

- Compiler and/or product reviews should include the version
  number and manufacturer. If possible, reviews should
  include a sample program with benchmarks.

If you have any questions, you can contact me at the
addresses listed below.
ADDRESSES

The C BBS is located at:

C BBS
c/o BCL Limited
P.O. Box 9162
McLean, VA 22102

or you can send netmail to: 1:109/713
INDEX
Subject: Issue:
Articles:
Additional Comments of Filename Wild.. 6
Beginning C Functions 7
C Spot Run: A User Supported Library 7
Database Design in C 10
Filechk 11
Filename Wildcard Expansion in MSC 4
Integrated Environment: TC & QC 5
Memory Models 11
Programming the Hercules Graphics Card 8
Programming the Hercules Graphics Card: Part 2 9
Talking with a Fossil 5
TurboC and Interrupts: A few Questions 2
Book Reviews:
C Chest: and other treasures. 6
C Database Development 1
C Programming Guide 1
C Programming Language 1
C Programmer's Guide to Serial Communications 3
C Programmer's Library 1
C Primer Plus 1
C the Complete Reference 2
Crafting C Tools for the IBM PC 2
Learning to Program in C 1
Microsoft C Programming on the IBM PC 1
MS-DOS Developer's Guide 4
Programming in Windows 3
QuickC Programming for the IBM 10
Reliable Data Structures in C 1
TurboC: Memory Resident Utilities 5
TurboC Programmer's Reference Book 2
Compilers:
QuickC 1
Software Reviews:
Bplus11.arc 3
C_Dates.arc 4
Cdate.arc 4
Casm.arc 3
C-subr.arc 4
Docu.arc 3
Ezwind.arc 8
Jcl-src.arc 4
Mscpopup.arc 3
Ndmake41.arc 4
Nuc-subr.arc 3
Prndoc.arc 6
Sed.arc 6
Shift_c.arc 4
Sysact11.arc 4
Tp_to_qc.arc 3
Xenixarc.arc 4
DISTRIBUTION POINTS
Board Name Number Net/Node Sysop
United States
C BBS (703) 644-6478 1:109/713 Barry Lynch
Burke, VA
Jaz C-Scape (904) 724-1377 1:112/1027 Tom Evans
Jacksonville, FL
Eastern C Board (201) 247-6748 1:107/335 Todd Lehr
OTHER BOARDS THAT CARRY C NEWS:
Exec-PC (414) 964-5160 .. Bob Mahoney
Milwaukee, WI
TAMIAMI (813) 793-2392 Gerhard Barth
Naples, FL
Sound of Music (516) 536-8723(2400) Paul Waldinger
(516) 536-6819(9600 Hayes V)
Canada
Another BBS System (416) 465-7752 1:148/208 Mark Bowman
Toronto, Canada
Europe
Fido_N1_1 31-8350-37156 2:500/1 Henk Wevers
The Netherlands
Australia
Sentry BBS 02-428-4687 ... Trev Roydhouse
(300-2400) Non-Mail Times
(300-19,200) Mail Hour (Trailblazer)
The Sentry BBS replaces Alpha-Centuri, which has been down for
a while.
USER RESPONSE FORM
This form will be included as a regular feature in all future
issues of C NEWS.
What did you think of the content of this Issue? _____________
_______________________________________________________________
What improvements can you think of that would make C News a
better tool for the C Community?
_______________________________________________________________
_______________________________________________________________
What is your favorite section or sections? ___________________
_______________________________________________________________
What don't you like about C News? ____________________________
_______________________________________________________________
Additional Comments: _________________________________________
_______________________________________________________________
_______________________________________________________________
_______________________________________________________________